Fix bug: stop/kill swss container will drag down syncd #2242
qiluo-msft wants to merge 1 commit into sonic-net:master
Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
yxieca
left a comment
Let me test it. We need to make sure the following scenarios work:
- In the non-warm case, when swss is stopped/started/restarted, the syncd service needs to follow the state change.
- In the warm case:
  - The swss service needs to be able to stop/start/restart without incurring a syncd state change.
  - The syncd service needs to be able to stop/start/restart without incurring a swss state change.
With the double attach, in the warm case, we also need to test sequences like the following (not limited to these):
- stop swss (will it now stop syncd even though it shouldn't?)
- start swss (can this be done if swss is still attached to syncd?)
- stop syncd (will it cause swss to stop?)
- ...
I need to think about what needs to be tested and run some tests before getting back to you.
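The scenarios listed above could be written down as a small expectation matrix before running anything on a device. This is only a sketch: the names `peer_should_follow` and `build_test_matrix` are hypothetical helpers, not part of any SONiC code, and the expectations encode only what the comment states. Where the comment says nothing (syncd acting in the non-warm case), the expectation is left as `None`.

```python
from itertools import product
from typing import List, Optional, Tuple

# Hypothetical names, for illustration only.
ACTIONS = ("stop", "start", "restart")
SERVICES = ("swss", "syncd")
MODES = ("non-warm", "warm")


def peer_should_follow(actor: str, warm: bool) -> Optional[bool]:
    """Should the peer service follow the actor's state change?

    True  -> the peer is expected to follow the actor.
    False -> the peer is expected to stay unchanged.
    None  -> the review comment does not state an expectation.
    """
    if actor not in SERVICES:
        raise ValueError(f"unknown service: {actor}")
    if warm:
        # Warm case: neither service may change the other's state.
        return False
    if actor == "swss":
        # Non-warm case: syncd follows swss stop/start/restart.
        return True
    # Non-warm case with syncd as the actor is not covered by the comment.
    return None


def build_test_matrix() -> List[Tuple[str, str, str, Optional[bool]]]:
    """Enumerate (mode, actor, action, expectation) for every combination."""
    return [
        (mode, actor, action, peer_should_follow(actor, mode == "warm"))
        for mode, actor, action in product(MODES, SERVICES, ACTIONS)
    ]


if __name__ == "__main__":
    for mode, actor, action, follows in build_test_matrix():
        print(f"{mode:8} {action:7} {actor:5} -> peer follows: {follows}")
```

The rows with `None` are the ones that still need an agreed expectation before they can be turned into real test steps.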
With the second attach, when the swss docker is killed, the swss service state would stay active. Is this the behavior we want? Personally, I don't think this is the right behavior for a service. Also, having swss attach to the syncd docker adds a dependency that we are trying to get rid of.
Also, if for whatever reason the swss service stops being attached to the swss docker and is attached only to syncd, and a syncd warm restart is performed at that time, the swss service will fail and be killed.
```
99425a8 (HEAD -> 202205, origin/202205) [actions] Support Semgrep by Github Actions (sonic-net#2417)
f41e4d1 Fix for show vxlan tunnel command display issue sonic-net#11902 (sonic-net#2391)
e1d827e [VxLAN]Fix Vxlan delete command to throw error when there are references (sonic-net#2404)
d77acf8 [doc] add documentation on automatic techsupport based on memory (sonic-net#2411)
2cfc75a [doc] update "config feature" section with "--block" option (sonic-net#2409)
9dc8471 [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (sonic-net#2398)
342589e Added cisco config platform commands (sonic-net#2242) (sonic-net#2418)
be7da6b [sonic-installer] use host docker startup arguments when running dockerd in chroot (sonic-net#2179) (sonic-net#2407)
d112f7c [202205][auto-ts] add memory check (sonic-net#2116) (sonic-net#2413)
```
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
linkmgrd:
* a5ac7f6 2022-10-05 | [Active-Active] Post link prober stats to state db (sonic-net#140) (HEAD -> 202205, github/202205) [Jing Zhang]
* f4b0e53 2022-10-05 | [Active-Active] Retry config mux mode standby (sonic-net#139) [Jing Zhang]
utilities:
* a255838 2022-10-04 | [minigraph] new workflow for golden path (sonic-net#2396) (HEAD -> 202205, github/202205) [jingwenxie]
* 99425a8 2022-10-03 | [actions] Support Semgrep by Github Actions (sonic-net#2417) [Mai Bui]
* f41e4d1 2022-09-30 | Fix for show vxlan tunnel command display issue sonic-net#11902 (sonic-net#2391) [Senthil Bhava]
* e1d827e 2022-09-29 | [VxLAN]Fix Vxlan delete command to throw error when there are references (sonic-net#2404) [Sudharsan Dhamal Gopalarathnam]
* d77acf8 2022-09-28 | [doc] add documentation on automatic techsupport based on memory (sonic-net#2411) [Stepan Blyshchak]
* 2cfc75a 2022-09-28 | [doc] update "config feature" section with "--block" option (sonic-net#2409) [Stepan Blyshchak]
* 9dc8471 2022-09-28 | [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (sonic-net#2398) [Vivek]
* 342589e 2022-10-03 | Added cisco config platform commands (sonic-net#2242) (sonic-net#2418) [yucgu]
swss:
* 9d9f395 2022-10-04 | [intfmgr]: Enable `accept_untracked_na` kernel param (sonic-net#2436) (HEAD -> 202205, github/202205) [Lawrence Lee]
* 6b6d25d 2022-10-04 | [orchdaemon]: Fixed sairedis record file rotation (sonic-net#2480) [Bryan Crossland]
Signed-off-by: Ying Xie <ying.xie@microsoft.com>
4237794 [muxcable][config] add CLI support for mux mode detach (sonic-net#2425)
a817896 YANG validation for ConfigDB Updates: MGMT_INTERFACE, PORTCHANNEL_MEMBER use cases (sonic-net#2420)
81e2aec [minigraph] new workflow for golden path (sonic-net#2396)
c1206aa ConfigDB Updates with YANG Validation: Include potential for YANG validation even when adhoc validation is used (sonic-net#2412)
57c509a [show] vnet endpoint [ip/ipv6] command (sonic-net#2342)
4b2b766 [actions] Support Semgrep by Github Actions (sonic-net#2417)
156257e check for vxlan mapping before removing vlan (sonic-net#2388)
cb0edd3 Fix for show vxlan tunnel command display issue sonic-net#11902 (sonic-net#2391)
ac71d74 [VxLAN]Fix Vxlan delete command to throw error when there are references (sonic-net#2404)
7419c67 Added cisco config platform commands (sonic-net#2242)
8760bbe Add UT to check sonic installer does not depend on database (sonic-net#2401)
6bef652 [doc] add documentation on automatic techsupport based on memory (sonic-net#2411)
4a78374 [doc] update "config feature" section with "--block" option (sonic-net#2409)
dd6210f [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (sonic-net#2398)
bdc4a8a Fix broken pipeline build URL (sonic-net#2363)
b31681b Fix display disorder problem of show vrf (sonic-net#2392)
123504a YANG validation for ConfigDB Updates: portchannel add/remove, loopback interface, VLAN
28f6820 [link-local]Modify RIF check to include link-local enabled interfaces (sonic-net#2394)
- What I did
Add cisco sub-command option under 'config platform' command.
- How I did it
In config/main.py, check the platform type and import the cisco.py file under cisco platform code when it's cisco-8000.
- How to verify it
Run config platform -h to see all commands. We will be able to see config platform cisco. This is only available on cisco devices.
Signed-off-by: Yucai Gu yucgu@cisco.com
Signed-off-by: Qi Luo qiluo-msft@users.noreply.github.com
- What I did
The swss service includes two Docker containers: swss and syncd. In the situations below,
We should treat the service as being in the 'active (running)' state. Currently, stopping or killing the swss container drags down the syncd container and marks the service as 'failed (Result: exit-code)' with '(code=exited, status=137)'.
PS: 137 is the exit status of a process killed by SIGKILL (128 + 9).
ref: http://tldp.org/LDP/abs/html/exitcodes.html
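As a quick illustration of the exit status mentioned above: by shell convention, a status above 128 means the process was terminated by signal number (status - 128). The helper below is a minimal sketch for decoding such a value. `describe_exit_status` is a hypothetical name, not something from the swss scripts, and the signal numbering assumes a Linux host.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper, Linux assumed)."""
    if status == 0:
        return "exited normally"
    if status > 128:
        try:
            # 128 + N conventionally means "terminated by signal N".
            sig = signal.Signals(status - 128)
        except ValueError:
            return f"exited with status {status}"
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {status}"


if __name__ == "__main__":
    # 137 is the status reported in the service failure above.
    print(describe_exit_status(137))
```

A stop that ends in SIGTERM would show up as 143 under the same convention, so the two cases can be told apart in logs.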
- How I did it
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)